Dependency vulnerability scanner powered by OSV.dev and your choice of LLM.
Upload a `requirements.txt` or `package.json`. DepWatch queries every pinned dependency against the OSV.dev vulnerability database, then sends the raw findings to an AI provider — Anthropic Claude or Alibaba Qwen — which explains each CVE in plain English, constructs a realistic exploit scenario, and recommends a specific remediation. Results are ranked by urgency and persisted to Postgres so you can track a project's vulnerability profile over time.
- Features
- Screenshots
- How It Works
- Tech Stack
- Project Structure
- Supported File Formats
- AI Providers
- OSV.dev — Data Source
- Prerequisites
- Quick Start — Docker Compose
- Local Development — Without Docker
- Environment Variables Reference
- API Reference
- Running Tests
- CI / GitHub Actions
- Database & Migrations
- Design Decisions
- Troubleshooting
- Contributing
- Licence
| Feature | Detail |
|---|---|
| File upload | requirements.txt (Python/pip) and package.json (Node/npm) |
| Batch OSV queries | All dependencies checked in a single HTTP call — no API key required |
| AI enrichment | Plain-English CVE explanation, exploit scenario, and version-specific fix |
| Dual AI provider | Switch between Anthropic Claude and Alibaba Qwen with one env var |
| Urgency ranking | Every CVE classified as Immediate, Soon, or Low Priority |
| Severity scoring | CVSS scores extracted from OSV data; Critical / High / Medium / Low labels |
| Summary dashboard | At-a-glance counts: total deps, vulnerable packages, breakdown by urgency and severity |
| Sortable table | Vulnerabilities sorted by urgency then CVSS score; each row expands to show full detail |
| Scan history | Every scan persisted to Postgres; full reports accessible at any time |
| Graceful degradation | If the AI provider is unreachable, raw OSV data is returned with auto-generated stubs — the scan never silently fails |
| Docker Compose | One command boots Postgres + FastAPI backend + Reflex frontend |
| GitHub Actions CI | pytest + Ruff on every push; runs fully offline (SQLite, no network) |
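The "urgency then CVSS" ordering described in the table above can be sketched as follows. This is an illustrative snippet, not the project's actual code: the rank map and the `urgency`/`cvss_score` field names are assumptions based on the README.

```python
# Hypothetical sketch of the "urgency, then CVSS score" sort order.
# Field names mirror the README; the rank values are illustrative.

URGENCY_RANK = {"Immediate": 0, "Soon": 1, "Low Priority": 2}

def sort_vulnerabilities(vulns: list[dict]) -> list[dict]:
    """Most urgent first; within an urgency bucket, highest CVSS first."""
    return sorted(
        vulns,
        key=lambda v: (
            URGENCY_RANK.get(v.get("urgency"), 3),  # unknown urgency sorts last
            -(v.get("cvss_score") or 0.0),          # missing score treated as 0.0
        ),
    )
```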
| Upload screen | Scan report | History |
|---|---|---|
| ![]() | ![]() | ![]() |
┌──────────────────────────────────────────────────────────────────────────────┐
│ Browser · Reflex UI (port 3000) │
│ Pure Python → compiles to Vite + React at build time │
└─────────────────────────────────┬────────────────────────────────────────────┘
│ HTTP POST multipart/form-data
▼
┌──────────────────────────────────────────────────────────────────────────────┐
│ FastAPI backend (port 8000) │
│ │
│ POST /api/v1/scan/ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 1. Receive UploadFile (≤ 5 MB) │ │
│ │ 2. Detect type from filename (.txt → requirements, .json → npm) │ │
│ │ 3. Parse → extract pinned (==) dependencies only │ │
│ │ 4. POST /v1/querybatch → OSV.dev ──────────────────────────────────────► │
│ │ 5. Flatten CVEs → batch prompt → AI provider ◄──────────────────────── │
│ │ (Anthropic claude-sonnet-4-20250514 OR Qwen qwen-plus) │ │
│ │ 6. Persist Scan + Vulnerability rows to Postgres │ │
│ │ 7. Return ScanResponse JSON │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ GET /api/v1/history/ paginated scan list │
│ GET /api/v1/history/{id} full report for a past scan │
│ GET /health liveness probe │
└──────────────────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────┐
│ PostgreSQL 17 │
│ scans │
│ vulnerabilities │
└───────────────────────┘
- Parse — `app/parsers/requirements.py` or `app/parsers/package_json.py` extracts `(name, version, ecosystem)` tuples. Only exact-version pins (`==` / bare semver) are retained; ranges, VCS URLs, and wildcards are skipped and returned to the caller as `skipped_lines`.
- OSV query — `app/services/osv_client.py` builds a single `POST /v1/querybatch` payload. OSV returns vulnerability lists in the same order as the query, so order is always preserved. Chunks of 100 are used as a safety margin below OSV's 1 000-item batch limit.
- AI enrichment — `app/services/ai_enrichment.py` flattens all `(dep, [OsvVulnerability])` pairs into a JSON array and sends it to the configured provider in a single call (sub-batched at 50 CVEs to respect context limits). The system prompt — defined as a constant in `app/services/prompts.py` — instructs the model to return a strict JSON array matching the `VulnerabilityResult` schema. Three fallback levels handle malformed responses: direct parse → embedded-array extraction → stub generation.
- Persist — A `Scan` row and one `Vulnerability` row per CVE are written inside the same `get_db` session. The commit is handled by the `get_db` dependency, not the route handler.
- Respond — The route returns a `ScanResponse` with the full vulnerability list, summary statistics, and scan metadata.
| Layer | Library | Version | Notes |
|---|---|---|---|
| Frontend | Reflex | 0.8.27 | Pure Python → Vite + React |
| Backend framework | FastAPI | 0.115.6 | Async, OpenAPI auto-docs |
| ASGI server | Uvicorn | 0.32.1 | With standard extras (watchfiles, httptools) |
| File upload | python-multipart | 0.0.20 | FastAPI UploadFile dependency |
| HTTP client | httpx | 0.28.1 | Async; used for OSV + Reflex→backend calls |
| Data validation | Pydantic v2 | 2.10.4 | Models for OSV responses and API shapes |
| Settings | pydantic-settings | 2.7.0 | .env loading with type coercion |
| ORM | SQLAlchemy | 2.0.36 | Async 2.0 style |
| DB driver | asyncpg | 0.30.0 | Async PostgreSQL driver |
| Migrations | Alembic | 1.14.0 | Async-aware env.py |
| Database | PostgreSQL | 17 (Alpine) | UUID PKs, timezone-aware timestamps |
| AI — Anthropic | anthropic | 0.43.0 | AsyncAnthropic client |
| AI — Qwen | openai | 1.58.3 | OpenAI SDK pointed at Dashscope |
| Testing | pytest | 8.3.4 | + pytest-asyncio 0.24.0, pytest-httpx 0.35.0 |
| Linting | Ruff | 0.9.1 | Lint + format check in CI |
| Containerisation | Docker Compose | v2 | Three services: db, backend, frontend |
| CI | GitHub Actions | — | Ubuntu latest, Python 3.12 |
depwatch/
│
├── .github/
│ └── workflows/
│ └── ci.yml # pytest + ruff on push / PR to main
│
├── backend/
│ ├── app/
│ │ ├── __init__.py
│ │ ├── config.py # pydantic-settings: all env vars + provider helpers
│ │ ├── database.py # async engine, AsyncSessionLocal, Base, get_db
│ │ ├── main.py # FastAPI app factory: lifespan, CORS, router mounts
│ │ │
│ │ ├── models/
│ │ │ ├── __init__.py
│ │ │ └── scan.py # Scan + Vulnerability SQLAlchemy ORM models
│ │ │
│ │ ├── parsers/
│ │ │ ├── __init__.py
│ │ │ ├── requirements.py # requirements.txt → DependencyItem list
│ │ │ └── package_json.py # package.json → DependencyItem list
│ │ │
│ │ ├── routers/
│ │ │ ├── __init__.py
│ │ │ ├── scan.py # POST /api/v1/scan/ — full scan pipeline
│ │ │ └── history.py # GET /api/v1/history/ + GET /api/v1/history/{id}
│ │ │
│ │ ├── schemas/
│ │ │ ├── __init__.py
│ │ │ ├── osv.py # Pydantic models mirroring OSV.dev API response
│ │ │ └── scan.py # API-layer request/response schemas
│ │ │
│ │ └── services/
│ │ ├── __init__.py
│ │ ├── prompts.py # VULNERABILITY_ENRICHMENT_SYSTEM_PROMPT constant
│ │ ├── osv_client.py # async httpx OSV.dev batch client
│ │ └── ai_enrichment.py # BaseEnrichmentService + Anthropic/Qwen subclasses
│ │
│ ├── alembic/
│ │ ├── env.py # async-aware Alembic environment
│ │ └── versions/
│ │ └── 0001_initial.py # scans + vulnerabilities tables + indexes
│ │
│ ├── tests/
│ │ ├── conftest.py # in-memory SQLite fixtures + httpx ASGI client
│ │ ├── test_parsers.py # 13 parser unit tests
│ │ ├── test_osv_client.py # 7 OSV client tests (httpx mocked)
│ │ ├── test_ai_enrichment.py # 24 enrichment tests (both providers + factory)
│ │ └── test_routes.py # 12 route integration tests
│ │
│ ├── alembic.ini
│ ├── Dockerfile # multi-stage build, non-root runtime user
│ ├── pytest.ini # asyncio_mode = auto
│ └── requirements.txt
│
├── frontend/
│ ├── depwatch/
│ │ ├── __init__.py
│ │ └── depwatch.py # complete Reflex app: AppState + all pages
│ ├── Dockerfile # Python 3.12 + Node 20 LTS
│ ├── requirements.txt
│ └── rxconfig.py # Reflex port config (frontend 3000, backend 3001)
│
├── .env.example # all variables documented with defaults
├── .gitignore
├── docker-compose.yml # db + backend + frontend with health checks
└── README.md
DepWatch extracts only pinned dependencies (the == operator). This is an intentional constraint: OSV.dev's /querybatch API requires an exact version string. Range specifiers like >=1.0,<2.0 cannot be resolved to a single version without running pip install; scanning them would produce unreliable results.
What is scanned:
flask==3.0.0
requests==2.31.0
uvicorn[standard]==0.32.1 # extras ([...]) are stripped — name becomes "uvicorn"
Django==4.2.0 ; python_version>="3.10" # environment markers stripped
What is skipped (returned in the API response as skipped_lines for transparency):
flask>=3.0.0 # range constraint
requests~=2.28 # compatible release
sqlalchemy # unpinned — no version at all
git+https://github.com/org/pkg # VCS URL
-r other-requirements.txt # include directive
--index-url https://pypi.org/ # option flag
https://example.com/pkg.whl # direct URL
DepWatch reads `dependencies`, `devDependencies`, and `peerDependencies`. Duplicate packages (appearing in more than one group) are de-duplicated, with the first occurrence winning.
What is scanned (exact semver only):
{
"dependencies": {
"express": "4.18.2",
"semver": "=7.5.4"
}
}

What is skipped:
{
"dependencies": {
"lodash": "^4.17.21",
"axios": "~1.6.0",
"react": ">=18.0.0",
"my-lib": "file:../my-lib",
"workspace": "workspace:*",
"from-git": "github:user/repo"
}
}

Tip: to maximise scan coverage, pin all your production dependencies. Run `pip freeze > requirements.txt` or `npm shrinkwrap` to generate a fully-pinned lockfile suitable for DepWatch.
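The exact-semver filter and first-wins de-duplication described above can be sketched like this. It is a rough approximation of `app/parsers/package_json.py`; the regex and helper name are assumptions.

```python
import json
import re

# Accepts exact semver, with an optional leading "=" and an optional pre-release
# suffix ("14.0.0-canary.1"); rejects ranges (^, ~, >=), file:, workspace:, github: specs.
EXACT_SEMVER = re.compile(r"^=?\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")

def extract_exact_versions(package_json: str) -> dict[str, str]:
    """Collect exactly-pinned packages across all three dependency groups."""
    data = json.loads(package_json)
    seen: dict[str, str] = {}
    for group in ("dependencies", "devDependencies", "peerDependencies"):
        for name, spec in data.get(group, {}).items():
            if name in seen:
                continue                       # first occurrence wins
            if EXACT_SEMVER.match(spec):
                seen[name] = spec.lstrip("=")  # "=7.5.4" -> "7.5.4"
    return seen
```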
DepWatch supports two interchangeable AI providers, selected by a single environment variable. Both receive the same system prompt and return the same structured JSON — switching providers requires no code changes.
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
- Default model: `claude-sonnet-4-20250514`
- Override: `ANTHROPIC_MODEL=claude-opus-4-5` (or any available model)
- SDK: `anthropic` Python SDK, `AsyncAnthropic` client
- Get a key: https://console.anthropic.com
Claude is called via `messages.create()` with a `system` parameter containing the structured enrichment prompt. The response text is read from the first content block (`content[0].text`).
AI_PROVIDER=qwen
QWEN_API_KEY=sk-...
- Default model: `qwen-plus` — balanced capability and cost for structured JSON
- Override: `QWEN_MODEL=qwen-max` (highest capability) or `qwen-turbo` (fastest / cheapest)
- SDK: `openai` Python SDK, `AsyncOpenAI` client pointed at the Dashscope endpoint
- Endpoint: `https://dashscope.aliyuncs.com/compatible-mode/v1` (OpenAI-compatible)
- Get a key: https://dashscope.aliyuncs.com
Because Dashscope's API is fully OpenAI-compatible, no Qwen-specific SDK is required. The `openai` SDK is used with a custom `base_url`. Qwen calls include `response_format={"type": "json_object"}` to encourage strict JSON output (supported by `qwen-plus` and `qwen-max`; remove this if using `qwen-turbo`).
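The request shape described above can be sketched with a small helper. `build_qwen_request` is a hypothetical illustration, not the project's actual code; in the real service the returned kwargs would be passed to `AsyncOpenAI(api_key=..., base_url=...).chat.completions.create(**kwargs)`.

```python
import json

# Hypothetical helper showing the Dashscope-compatible request parameters.
def build_qwen_request(model: str, system_prompt: str, cve_batch: list[dict]) -> dict:
    kwargs = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": json.dumps(cve_batch)},
        ],
    }
    if model in ("qwen-plus", "qwen-max"):
        # response_format is omitted for qwen-turbo, as noted above
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs
```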
| Anthropic Claude Sonnet | Qwen Plus | Qwen Max | |
|---|---|---|---|
| JSON reliability | Excellent | Very good | Excellent |
| Context window | 200k tokens | 131k tokens | 32k tokens |
| Speed | Fast | Fast | Moderate |
| Cost | $$ | $ | $$ |
| Best for | Production use | Cost-sensitive / China region | Maximum accuracy |
AI_PROVIDER=anthropic → AnthropicEnrichmentService (anthropic SDK)
AI_PROVIDER=qwen → QwenEnrichmentService (openai SDK + Dashscope)
(missing key) → StubEnrichmentService (OSV data + auto labels)
The factory function get_enrichment_service() in app/services/ai_enrichment.py returns a cached singleton. All three classes extend BaseEnrichmentService, which owns the JSON-parsing pipeline (direct parse → embedded-array extraction → stubs). Subclasses implement only _call_api(user_message) -> str.
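The cached-singleton factory pattern can be sketched as follows. The classes here are stand-ins so the caching behaviour is visible; the real implementations live in `app/services/ai_enrichment.py` and branch on the configured provider and key presence.

```python
from functools import lru_cache

# Minimal sketch of the cached-singleton factory described above (stand-in classes).
class BaseEnrichmentService:
    def _call_api(self, user_message: str) -> str:  # overridden per provider
        raise NotImplementedError

class StubEnrichmentService(BaseEnrichmentService):
    def _call_api(self, user_message: str) -> str:
        return ""  # no provider configured: parsing pipeline falls back to stubs

@lru_cache(maxsize=1)
def get_enrichment_service() -> BaseEnrichmentService:
    # The real factory inspects settings (AI_PROVIDER, API keys) here.
    return StubEnrichmentService()

def reset_enrichment_service() -> None:
    get_enrichment_service.cache_clear()  # used by factory tests to avoid pollution
```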
If the selected provider's API key is absent or the API call fails at runtime, BaseEnrichmentService catches the exception and falls back to stub objects. Stubs are auto-populated from raw OSV data:
- `plain_english_explanation` ← OSV `summary` field
- `severity` ← derived from CVSS score
- `urgency` ← derived from severity label
- `remediation` ← `"Upgrade to the latest patched version."`
The scan response is returned successfully — the user sees real CVE identifiers and CVSS scores even without AI enrichment.
OSV.dev (Open Source Vulnerabilities) is a free, open vulnerability database maintained by Google. It aggregates advisories from:
| Source | Ecosystems |
|---|---|
| GitHub Security Advisories (GHSA) | All GitHub-hosted packages |
| CVE Programme (NVD) | Cross-ecosystem CVEs |
| PyPI Advisory Database | Python / pip |
| npm Advisory Database | JavaScript / npm |
| RustSec | Rust / Cargo |
| Go Vulnerability Database | Go modules |
| OSS-Fuzz | C, C++ and more |
| …and many more | Maven, NuGet, Hex, Pub, etc. |
DepWatch calls POST https://api.osv.dev/v1/querybatch with all pinned dependencies in a single request body:
{
"queries": [
{ "package": { "name": "flask", "ecosystem": "PyPI" }, "version": "1.0.0" },
{ "package": { "name": "requests", "ecosystem": "PyPI" }, "version": "2.25.0" }
]
}

OSV returns results in the same order as the queries. Each result is a list of `OsvVulnerability` objects, which may include:
- `id` — OSV identifier (e.g. `GHSA-xxxx-yyyy-zzzz`) or `CVE-YYYY-NNNNN`
- `aliases` — cross-references, including CVE IDs when the primary ID is a GHSA
- `severity` — CVSS vector string (e.g. `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H`)
- `summary` — one-line description
- `details` — full advisory text
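The batch payload above, with the 100-item chunking mentioned earlier, can be sketched as a pure helper. This is a sketch of the payload construction only; the real client (`app/services/osv_client.py`) would send each chunk with an async `httpx` `POST` to `https://api.osv.dev/v1/querybatch`.

```python
# Sketch of querybatch payload construction with 100-item chunking.
def build_querybatch_payloads(
    deps: list[tuple[str, str, str]],  # (name, version, ecosystem)
    chunk_size: int = 100,
) -> list[dict]:
    queries = [
        {"package": {"name": name, "ecosystem": eco}, "version": version}
        for name, version, eco in deps
    ]
    # One request body per chunk, each safely below OSV's 1 000-item batch limit.
    return [
        {"queries": queries[i : i + chunk_size]}
        for i in range(0, len(queries), chunk_size)
    ]
```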
DepWatch extracts a numeric CVSS base score from the vector string using a three-level fallback: direct float parse → trailing number in vector → qualitative label mapping (HIGH → 7.5, CRITICAL → 9.5, etc.).
No API key is required. OSV.dev is fully free and open.
| Requirement | Minimum version | Notes |
|---|---|---|
| Git | any | — |
| Docker Desktop | 4.x | Includes Docker Compose v2 |
| Python | 3.12 | Only needed for local dev without Docker |
| Node.js | 20 LTS | Only needed for local dev without Docker (Reflex build pipeline) |
| Anthropic API key | — | Required if AI_PROVIDER=anthropic. Free tier available. |
| Qwen / Dashscope key | — | Required if AI_PROVIDER=qwen. Free tier available. |
This is the recommended way to run DepWatch. All three services (Postgres, backend, frontend) start with a single command.
# 1. Clone the repository
git clone https://github.com/yourname/depwatch.git
cd depwatch
# 2. Create your .env file from the template
cp .env.example .env

Open `.env` and set your AI provider:
# Option A — Anthropic Claude (default)
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Option B — Alibaba Qwen
# AI_PROVIDER=qwen
# QWEN_API_KEY=sk-your-dashscope-key-here

# 3. Build images and start all services
docker compose up --build
# To run in the background:
docker compose up --build -d

What happens on first boot:
| Step | Duration | Detail |
|---|---|---|
| Postgres starts | ~3 s | Health check waits for pg_isready |
| Backend starts | ~5 s | FastAPI runs Base.metadata.create_all() on boot |
| Reflex compiles | ~60 s | Vite bundle is compiled once; cached in the reflex_web Docker volume on subsequent starts |
Once running:
| URL | What it is |
|---|---|
| http://localhost:3000 | DepWatch web UI |
| http://localhost:8000/docs | FastAPI interactive API docs (Swagger UI) |
| http://localhost:8000/redoc | FastAPI API docs (ReDoc) |
| http://localhost:8000/health | Liveness probe |
Useful Compose commands:
# View logs for a specific service
docker compose logs -f backend
docker compose logs -f frontend
# Stop all services (preserves data volumes)
docker compose down
# Stop and remove all data (full reset)
docker compose down -v
# Rebuild a single service after a code change
docker compose up --build backend

Use this approach when you want faster iteration cycles or need to attach a debugger.
You still need Postgres running. The simplest way is to start just the DB container:
docker compose up db -d
# Postgres is now available on localhost:5432
# User: depwatch   Password: depwatch   Database: depwatch

Or use an existing local Postgres instance and update `DATABASE_URL` in `.env` accordingly.
cd backend
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install all dependencies (includes both anthropic and openai SDKs)
pip install -r requirements.txt
# Install the async SQLite driver used only in tests
pip install aiosqlite
# Copy and configure environment
cp ../.env.example .env
# Edit .env: set AI_PROVIDER and the corresponding API key
# Run database migrations
alembic upgrade head
# Start the development server with hot reload
uvicorn app.main:app --reload --port 8000

The API is now running at http://localhost:8000. Interactive docs at http://localhost:8000/docs.
Open a second terminal:
cd frontend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# First-time initialisation (creates the .web/ directory and installs npm deps)
reflex init
# Start the development server with hot reload
reflex run

The UI is now running at http://localhost:3000. Reflex hot-reloads on Python file saves.
# Check the backend health endpoint
curl http://localhost:8000/health
# → {"status": "ok", "version": "1.0.0"}
# Test a scan via curl (replace with your own requirements.txt)
curl -X POST http://localhost:8000/api/v1/scan/ \
  -F "file=@/path/to/requirements.txt"

All variables can be set in the `.env` file at the project root. Docker Compose reads them automatically. For local development without Docker, place `.env` in the `backend/` directory.
| Variable | Default | Required | Description |
|---|---|---|---|
| `AI_PROVIDER` | `anthropic` | No | Active AI provider. Options: `anthropic`, `qwen` |
| Variable | Default | Required | Description |
|---|---|---|---|
| `ANTHROPIC_API_KEY` | `""` | If `AI_PROVIDER=anthropic` | Your Anthropic API key. Get one at https://console.anthropic.com |
| `ANTHROPIC_MODEL` | `claude-sonnet-4-20250514` | No | Claude model to use for enrichment |
| Variable | Default | Required | Description |
|---|---|---|---|
| `QWEN_API_KEY` | `""` | If `AI_PROVIDER=qwen` | Your Dashscope API key. Get one at https://dashscope.aliyuncs.com |
| `QWEN_MODEL` | `qwen-plus` | No | Qwen model. Options: `qwen-turbo`, `qwen-plus`, `qwen-max`, `qwen-long` |
| `QWEN_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` | No | Dashscope OpenAI-compatible endpoint URL |
| Variable | Default | Required | Description |
|---|---|---|---|
| `DATABASE_URL` | `postgresql+asyncpg://depwatch:depwatch@localhost:5432/depwatch` | Yes | SQLAlchemy async DSN. Use the `postgresql+asyncpg://` scheme |
Note: `docker-compose.yml` overrides `DATABASE_URL` with the internal Docker network hostname `db`. You only need to set this manually for local dev without Docker.
| Variable | Default | Required | Description |
|---|---|---|---|
| `ENVIRONMENT` | `development` | No | Set to `production` to suppress SQL echo logging |
| `LOG_LEVEL` | `INFO` | No | Python logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
# .env — copy from .env.example and fill in your values
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-api03-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514
# QWEN_API_KEY=sk-...
# QWEN_MODEL=qwen-plus
# QWEN_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
DATABASE_URL=postgresql+asyncpg://depwatch:depwatch@localhost:5432/depwatch
ENVIRONMENT=development
LOG_LEVEL=INFO

The FastAPI backend exposes an OpenAPI spec at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc. All endpoints are prefixed with `/api/v1` except `/health`.
Upload a dependency file and receive a full vulnerability report.
Request
Content-Type: multipart/form-data
| Field | Type | Description |
|---|---|---|
| `file` | `UploadFile` | A `requirements.txt` or `package.json`. Maximum 5 MB. |
Response 201 Created
Error responses
| Status | Condition |
|---|---|
| `413 Request Entity Too Large` | File exceeds 5 MB |
| `422 Unprocessable Entity` | Unsupported file type, invalid JSON, or no pinned dependencies found |
| `502 Bad Gateway` | OSV.dev API returned an error or was unreachable |
Return a paginated list of past scans, newest first.
Query parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `limit` | integer (1–100) | `20` | Number of items to return |
| `offset` | integer (≥ 0) | `0` | Items to skip (for pagination) |
Response 200 OK
{
"items": [
{
"scan_id": "3fa85f64-...",
"filename": "requirements.txt",
"file_type": "requirements",
"total_dependencies": 24,
"vulnerable_count": 6,
"created_at": "2025-03-21T10:30:00Z"
}
],
"total": 42
}

Return the full vulnerability report for a historical scan.
Path parameter
| Parameter | Type | Description |
|---|---|---|
| `scan_id` | UUID | The scan ID returned by `POST /scan/` or listed in `GET /history/` |
Response 200 OK — same shape as POST /api/v1/scan/
Error responses
| Status | Condition |
|---|---|
| `404 Not Found` | No scan with this ID exists |
Liveness probe for load balancers and Docker health checks.
Response 200 OK
{ "status": "ok", "version": "1.0.0" }

The full test suite runs completely offline — SQLite in-memory replaces Postgres and all external HTTP calls (OSV.dev, Anthropic, Qwen) are intercepted by mocks. No API keys are required to run tests.
cd backend
# If not already done:
pip install -r requirements.txt
pip install aiosqlite  # async SQLite driver for in-memory test DB

# Run everything
pytest
# Verbose output with test names
pytest -v
# Run a single test module
pytest tests/test_parsers.py -v
pytest tests/test_osv_client.py -v
pytest tests/test_ai_enrichment.py -v
pytest tests/test_routes.py -v
# Run tests matching a keyword
pytest -k "qwen" -v
pytest -k "history" -v
# Stop on first failure
pytest -x
# Show slowest 10 tests
pytest --durations=10

Pure unit tests; no I/O. Cover both parsers exhaustively.
| Test | What it verifies |
|---|---|
| `test_single_pinned_dependency` | Basic `==` pin is extracted |
| `test_multiple_pinned_dependencies` | Multiple pins in correct order |
| `test_comments_are_ignored` | `#` lines don't appear in output |
| `test_blank_lines_are_ignored` | Empty lines don't crash |
| `test_unpinned_dependency_is_skipped` | `requests` alone goes to `skipped_lines` |
| `test_range_constraint_is_skipped` | `>=` constraint goes to `skipped_lines` |
| `test_extras_are_stripped_from_name` | `uvicorn[standard]` → name `uvicorn` |
| `test_environment_markers_are_ignored` | `; python_version>=...` stripped |
| `test_vcs_url_is_skipped` | `git+https://...` goes to `skipped_lines` |
| `test_flag_line_is_skipped` | `-r other.txt` goes to `skipped_lines` |
| `test_empty_file_returns_empty_result` | No crash on empty input |
| `test_pre_release_version` | `1.0.0b2` is a valid pinned version |
| `test_mixed_content` | All types together, correct counts |
| `test_exact_version_is_extracted` (package.json) | `"4.18.2"` → scanned |
| `test_caret_range_is_skipped` | `"^4.17.21"` → skipped |
| `test_dev_dependencies_are_included` | `devDependencies` entries scanned |
| `test_both_dep_groups_merged` | `dependencies` + `devDependencies` combined |
| `test_duplicate_package_deduped` | Package in both groups appears once |
| `test_equals_prefix_stripped` | `"=7.5.4"` → version `7.5.4` |
| `test_invalid_json_sets_parse_error` | Malformed JSON sets `parse_error` |
| `test_prerelease_semver_accepted` | `"14.0.0-canary.1"` is valid |
| Test | What it verifies |
|---|---|
| `test_query_batch_returns_pairs` | Happy path, two deps one vuln each |
| `test_query_batch_empty_input_returns_empty` | No HTTP call made |
| `test_query_batch_no_vulns` | Empty vuln list for a safe package |
| `test_query_batch_order_preserved` | Output order matches input order exactly |
| `test_osv_http_error_raises` | HTTP 500 propagates as `HTTPStatusError` |
| `TestExtractCvssScore::test_plain_float_score` | `"8.1"` → 8.1 |
| `TestExtractCvssScore::test_no_severity_returns_none` | `[]` → `None` |
| `TestExtractCvssScore::test_qualitative_high_maps_to_score` | `"HIGH"` → 7.5 |
| `TestExtractCvssScore::test_trailing_number_in_vector` | CVSS vector `/9.8` → 9.8 |
Uses pytest-httpx to intercept all httpx calls at the transport layer.
Structured into four classes:
`TestSharedEnrichmentLogic` — tests the parsing pipeline once, provider-agnostically:

- `test_clean_json_array_is_parsed`
- `test_json_embedded_in_prose_is_extracted`
- `test_malformed_response_returns_stubs`
- `test_empty_osv_pairs_returns_empty_no_api_call`
- `test_empty_string_response_returns_stubs`
- `test_urgency_derived_when_missing_from_response`
- `test_multiple_packages_all_returned`
- `test_api_exception_falls_back_to_stubs`

`TestAnthropicProvider`:

- `test_anthropic_happy_path`
- `test_anthropic_no_key_returns_empty_string`
- `test_anthropic_uses_configured_model`

`TestQwenProvider`:

- `test_qwen_happy_path`
- `test_qwen_no_key_returns_empty_string`
- `test_qwen_uses_configured_model`
- `test_qwen_system_prompt_passed_correctly`
- `test_qwen_json_object_wrapper_handled`
- `test_qwen_no_choices_returns_stubs`

`TestGetEnrichmentServiceFactory`:

- `test_factory_returns_anthropic_service_by_default`
- `test_factory_returns_qwen_service_when_configured`
- `test_factory_is_cached`
- `test_reset_clears_cache`
Full HTTP round-trip tests using httpx.AsyncClient with ASGI transport. External calls (OSV, AI) are mocked at the service layer.
`TestScanEndpoint`:

- `test_scan_requirements_returns_201`
- `test_scan_package_json_returns_201`
- `test_scan_unsupported_file_type_returns_422`
- `test_scan_invalid_json_returns_422`
- `test_scan_no_pinned_deps_returns_422`
- `test_scan_response_has_summary_fields`
- `test_scan_osv_failure_returns_502`
- `test_scan_persists_to_db` ← verifies actual DB row creation

`TestHistoryEndpoint`:

- `test_history_returns_200`
- `test_history_pagination`
- `test_history_detail_not_found`
- `test_history_detail_returns_scan`

`TestHealthEndpoint`:

- `test_health_returns_ok`
Why SQLite instead of Postgres for tests? Running Postgres in CI requires a service container, adds ~30 seconds of startup overhead, and couples the test suite to infrastructure. SQLite (in-memory via aiosqlite) is structurally identical for every query DepWatch runs. Postgres-specific behaviour — UUIDs, timezone-aware timestamps, cascade deletes — is validated by the Alembic migration file and integration tests.
Why mock at the service layer, not the HTTP layer, for route tests? Mocking get_osv_client() and get_enrichment_service() return values is faster, clearer, and more stable than intercepting HTTP calls from inside a full request cycle.
Why reset_enrichment_service() in factory tests? get_enrichment_service() caches a singleton. Tests that need to control which provider is returned must clear the cache in setup_method and teardown_method to avoid cross-test pollution.
The workflow at .github/workflows/ci.yml runs on every push to main or develop and on every pull request targeting main.
test — runs on ubuntu-latest, Python 3.12:
- Check out repository
- Set up Python with `pip` cache keyed on `requirements.txt`
- `pip install -r requirements.txt && pip install aiosqlite`
- `pytest tests/ -v --tb=short`
No Postgres service container is required. DATABASE_URL is set to a dummy Postgres URL (never actually connected) and the test session uses the SQLite override from conftest.py.
lint — runs in parallel with test:
- `pip install ruff==0.9.1`
- `ruff check app/ tests/`
- `ruff format --check app/ tests/`
Add this to your fork's README:
[![CI](https://github.com/yourname/depwatch/actions/workflows/ci.yml/badge.svg)](https://github.com/yourname/depwatch/actions/workflows/ci.yml)

scans
| Column | Type | Notes |
|---|---|---|
| `id` | `UUID` PK | `uuid4()` default |
| `filename` | `VARCHAR(255)` | Original upload filename |
| `file_type` | `VARCHAR(50)` | `"requirements"` or `"package_json"` |
| `total_dependencies` | `INTEGER` | Count of pinned deps parsed |
| `vulnerable_count` | `INTEGER` | Unique packages with ≥ 1 CVE |
| `created_at` | `TIMESTAMPTZ` | Server-side `now()` default |
vulnerabilities
| Column | Type | Notes |
|---|---|---|
| `id` | `UUID` PK | `uuid4()` default |
| `scan_id` | `UUID` FK → `scans.id` | `ON DELETE CASCADE` |
| `package_name` | `VARCHAR(255)` | |
| `package_version` | `VARCHAR(100)` | |
| `cve_id` | `VARCHAR(100)` | `CVE-YYYY-NNNNN` or GHSA ID |
| `cvss_score` | `FLOAT` | Nullable — not all OSV entries include a score |
| `severity` | `VARCHAR(50)` | Critical / High / Medium / Low / Unknown |
| `plain_english_explanation` | `TEXT` | AI-generated. Nullable |
| `exploit_scenario` | `TEXT` | AI-generated. Nullable |
| `remediation` | `TEXT` | AI-generated. Nullable |
| `urgency` | `VARCHAR(50)` | Immediate / Soon / Low Priority. Nullable |
Indexes: `ix_scans_created_at` on `scans.created_at`; `ix_vulnerabilities_scan_id` on `vulnerabilities.scan_id`.
cd backend
# Apply all pending migrations
alembic upgrade head
# Check current revision
alembic current
# Generate a new migration from ORM model changes
alembic revision --autogenerate -m "add remediation_url column"
# Roll back one revision
alembic downgrade -1

On application startup, `app/main.py` calls `Base.metadata.create_all()` via the lifespan handler. This is idempotent — it creates tables that don't exist and does nothing for tables that do. It is not a replacement for Alembic migrations (which handle column additions, renames, and index changes), but it ensures the app works on first run from Docker Compose without a manual migration step.
For production deployments, run alembic upgrade head as an init container or as part of your deployment pipeline before starting the application.
A single POST /v1/querybatch replaces N individual POST /v1/query calls. For a 30-dependency requirements.txt this cuts OSV round-trips from 30 to 1, reducing scan latency by ~95% and being considerate to OSV's free infrastructure. The client auto-chunks at 100 items to stay safely below OSV's documented 1 000-item limit.
All CVEs from a scan are sent to the AI provider in one call rather than one call per CVE. For a file with 8 vulnerabilities, this is 8× cheaper and faster. Claude's 200k and Qwen's 131k context windows comfortably accommodate even large scans. At 50+ CVEs the client makes sequential sub-batch calls to stay within typical output token limits.
BaseEnrichmentService owns all JSON parsing, validation, and stub logic. AnthropicEnrichmentService and QwenEnrichmentService implement only _call_api(user_message) -> str. Adding a third provider (e.g. Gemini, Mistral, Ollama) requires only a new subclass and a one-line addition to the factory function — no changes to the shared parsing pipeline.
SQLAlchemy models define the database schema; Pydantic schemas define the HTTP contract. A DB column rename doesn't break the API response shape. An API response field addition doesn't require a migration. The two layers can evolve at different paces.
Sequential integer IDs expose row counts (/history/5 implies there are at least 5 scans) and make enumeration attacks trivial. UUIDs are safe to expose in URLs, API responses, and logs.
FastAPI serialises the response object immediately after the route handler returns, inside the same get_db session scope. With SQLAlchemy's default (expire_on_commit=True), attributes are expired after commit(). Accessing them would trigger implicit lazy loads on an async session, causing MissingGreenlet errors. expire_on_commit=False keeps all loaded attributes available for the lifetime of the request.
Postgres-in-CI requires a service container, extends pipeline time, and tightly couples test infrastructure to the database engine. SQLite (in-memory, via aiosqlite) is structurally identical for every query DepWatch executes. Postgres-specific behaviour is verified by the Alembic migration and the Docker Compose integration environment.
The get_db dependency commits inside try and rolls back inside except. Route handlers call await db.flush() (not commit()) to write rows within the transaction, letting the dependency control the transaction boundary. This prevents partially-written scans if the serialisation step raises after the insert.
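The transaction boundary described above is a common FastAPI dependency pattern. The sketch below uses a synchronous dummy session so the control flow is visible without SQLAlchemy; the real `get_db` yields an `AsyncSession` and awaits its `commit`/`rollback`/`close` calls.

```python
# Schematic sketch of the get_db commit/rollback boundary (dummy session,
# not the project's SQLAlchemy code).
class DummySession:
    def __init__(self):
        self.events: list[str] = []
    def flush(self):    self.events.append("flush")     # route handlers call this
    def commit(self):   self.events.append("commit")
    def rollback(self): self.events.append("rollback")
    def close(self):    self.events.append("close")

def get_db():
    session = DummySession()
    try:
        yield session           # route handler runs here and calls flush()
        session.commit()        # commit only if the route body succeeded
    except Exception:
        session.rollback()      # partial writes are discarded
        raise
    finally:
        session.close()
```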
VULNERABILITY_ENRICHMENT_SYSTEM_PROMPT is a module-level string constant in services/prompts.py, not an f-string or a database record. This makes it reviewable in code review, diffable in git, and importable in tests. The template is provider-agnostic — it instructs the model in terms of input/output JSON schemas, not in provider-specific syntax.
The first reflex run (or docker compose up --build) compiles the Vite bundle, which downloads npm packages and runs the build. This takes 60–120 seconds on a cold start. Subsequent starts use the cached .web/ directory (or the reflex_web Docker volume) and complete in ~5 seconds.
If it seems stuck, check the container logs:
```bash
docker compose logs -f frontend
```

Look for the line `App running at: http://localhost:3000` — that's when compilation is complete.
This is expected behaviour when the key is missing. The application will start and scans will succeed, but vulnerability reports will use stubs rather than AI-generated enrichment. To enable full enrichment, add your key to .env and restart the backend.
This appears in the plain_english_explanation field when the AI call fails or no key is configured. The scan result is still valid — OSV data (CVE IDs, CVSS scores) is present and correct. Check:
- Is `ANTHROPIC_API_KEY` or `QWEN_API_KEY` set in `.env`?
- Does `AI_PROVIDER` match the key you've provided?
- Check backend logs: `docker compose logs backend | grep "API error"`
DepWatch only scans exact versions. If your file contains only range constraints:
```text
# requirements.txt with ranges — will return 422
flask>=3.0.0
requests~=2.28
```

Generate a pinned file:

```bash
pip freeze > requirements-pinned.txt
```

Or for npm:

```bash
npm shrinkwrap   # generates npm-shrinkwrap.json (rename to package.json)
```

OSV.dev data is updated continuously but may lag new CVE publications by hours. Also verify:
- The package name exactly matches the PyPI or npm registry name (case-insensitive for PyPI, exact-case for npm)
- The version is genuinely affected — a patched version may correctly show no vulnerabilities
The backend waits for Postgres to pass its health check before starting (via the depends_on: condition: service_healthy setting in Compose). If you see connection errors, check:
```bash
docker compose ps
# Verify the `db` service shows "(healthy)"
docker compose logs db
# Look for "database system is ready to accept connections"
```

Default ports: 3000 (frontend), 3001 (Reflex internal), 8000 (backend), 5432 (Postgres). To change them, edit `docker-compose.yml` and `frontend/rxconfig.py`.
Contributions are welcome. Please follow this workflow:
```bash
# 1. Fork and clone
git clone https://github.com/yourname/depwatch.git
cd depwatch

# 2. Create a feature branch
git checkout -b feat/your-feature-name

# 3. Make changes and add tests

# 4. Verify locally
cd backend
pip install -r requirements.txt aiosqlite
pytest -v                 # all tests must pass
ruff check app/ tests/    # no lint errors
ruff format app/ tests/   # code must be formatted

# 5. Push and open a pull request
git push origin feat/your-feature-name
```

CI will run automatically on your PR. Merges to `main` require passing tests and lint.
- Create a subclass of `BaseEnrichmentService` in `app/services/ai_enrichment.py`:

  ```python
  class GeminiEnrichmentService(BaseEnrichmentService):
      async def _call_api(self, user_message: str) -> str:
          # call Gemini API, return raw text response
          ...
  ```

- Add a new `Literal` option to `ai_provider` in `app/config.py`
- Add any provider-specific settings fields to `Settings`
- Register the new class in `get_enrichment_service()`
- Add tests in `tests/test_ai_enrichment.py` following the `TestQwenProvider` pattern
- Update `.env.example` and this README
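The registration in `get_enrichment_service()` might look roughly like this — the provider keys and class names below are assumptions, with stub classes standing in for the real ones:

```python
# Hypothetical sketch of the provider registry; DepWatch's real class
# names, provider keys, and Settings wiring may differ.
class BaseEnrichmentService:
    """Stand-in for the real base class in app/services/ai_enrichment.py."""


class ClaudeEnrichmentService(BaseEnrichmentService): ...
class QwenEnrichmentService(BaseEnrichmentService): ...
class GeminiEnrichmentService(BaseEnrichmentService): ...  # the new provider


_PROVIDERS: dict[str, type[BaseEnrichmentService]] = {
    "claude": ClaudeEnrichmentService,
    "qwen": QwenEnrichmentService,
    "gemini": GeminiEnrichmentService,  # register the new class here
}


def get_enrichment_service(ai_provider: str) -> BaseEnrichmentService:
    return _PROVIDERS[ai_provider]()
```

With a mapping like this, switching providers stays a one-env-var change (`AI_PROVIDER=gemini`), matching the existing Claude/Qwen behaviour.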
MIT — see LICENSE for full text.
```jsonc
{
  "scan_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "filename": "requirements.txt",
  "file_type": "requirements",          // "requirements" | "package_json"
  "created_at": "2025-03-21T10:30:00Z",
  "summary": {
    "total_dependencies": 24,
    "vulnerable_count": 6,
    "immediate_count": 3,
    "soon_count": 2,
    "low_priority_count": 1,
    "critical_count": 1,
    "high_count": 2,
    "medium_count": 2,
    "low_count": 1
  },
  "vulnerabilities": [
    {
      "id": "7d8e9f10-...",
      "package_name": "flask",
      "package_version": "1.0.0",
      "cve_id": "CVE-2023-30861",
      "cvss_score": 7.5,
      "severity": "High",
      "plain_english_explanation": "Flask before 2.3.2 mishandles the Vary header...",
      "exploit_scenario": "An attacker and victim share a reverse-proxy cache...",
      "remediation": "Upgrade to flask==2.3.3 or later.",
      "urgency": "Immediate"            // "Immediate" | "Soon" | "Low Priority"
    }
    // ...one object per CVE
  ]
}
```